The search functionality is under construction.

Keyword Search Result

[Keyword] neural networks(287hit)

81-100hit(287hit)

  • Deep Neural Network Based Monaural Speech Enhancement with Low-Rank Analysis and Speech Present Probability

    Wenhua SHI  Xiongwei ZHANG  Xia ZOU  Meng SUN  Wei HAN  Li LI  Gang MIN  

     
    LETTER-Noise and Vibration

      Vol:
    E101-A No:3
      Page(s):
    585-589

    A monaural speech enhancement method combining deep neural network (DNN) with low rank analysis and speech present probability is proposed in this letter. Low rank and sparse analysis is first applied on the noisy speech spectrogram to get the approximate low rank representation of noise. Then a joint feature training strategy for DNN based speech enhancement is presented, which helps the DNN better predict the target speech. To reduce the residual noise in highly overlapping regions and high frequency domain, speech present probability (SPP) weighted post-processing is employed to further improve the quality of the speech enhanced by trained DNN model. Compared with the supervised non-negative matrix factorization (NMF) and the conventional DNN method, the proposed method obtains improved speech enhancement performance under stationary and non-stationary conditions.

  • Corpus Expansion for Neural CWS on Microblog-Oriented Data with λ-Active Learning Approach

    Jing ZHANG  Degen HUANG  Kaiyu HUANG  Zhuang LIU  Fuji REN  

     
    PAPER-Natural Language Processing

      Pubricized:
    2017/12/08
      Vol:
    E101-D No:3
      Page(s):
    778-785

    Microblog data contains rich information of real-world events with great commercial values, so microblog-oriented natural language processing (NLP) tasks have grabbed considerable attention of researchers. However, the performance of microblog-oriented Chinese Word Segmentation (CWS) based on deep neural networks (DNNs) is still not satisfying. One critical reason is that the existing microblog-oriented training corpus is inadequate to train effective weight matrices for DNNs. In this paper, we propose a novel active learning method to extend the scale of the training corpus for DNNs. However, due to a large amount of partially overlapped sentences in the microblogs, it is difficult to select samples with high annotation values from raw microblogs during the active learning procedure. To select samples with higher annotation values, parameter λ is introduced to control the number of repeatedly selected samples. Meanwhile, various strategies are adopted to measure the overall annotation values of a sample during the active learning procedure. Experiments on the benchmark datasets of NLPCC 2015 show that our λ-active learning method outperforms the baseline system and the state-of-the-art method. Besides, the results also demonstrate that the performances of the DNNs trained on the extended corpus are significantly improved.

  • End-to-End Exposure Fusion Using Convolutional Neural Network

    Jinhua WANG  Weiqiang WANG  Guangmei XU  Hongzhe LIU  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2017/11/22
      Vol:
    E101-D No:2
      Page(s):
    560-563

    In this paper, we describe the direct learning of an end-to-end mapping between under-/over-exposed images and well-exposed images. The mapping is represented as a deep convolutional neural network (CNN) that takes multiple-exposure images as input and outputs a high-quality image. Our CNN has a lightweight structure, yet gives state-of-the-art fusion quality. Furthermore, we know that for a given pixel, the influence of the surrounding pixels gradually increases as the distance decreases. If the only pixels considered are those in the convolution kernel neighborhood, the final result will be affected. To overcome this problem, the size of the convolution kernel is often increased. However, this also increases the complexity of the network (too many parameters) and the training time. In this paper, we present a method in which a number of sub-images of the source image are obtained using the same CNN model, providing more neighborhood information for the convolution operation. Experimental results demonstrate that the proposed method achieves better performance in terms of both objective evaluation and visual quality.

  • Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals

    Tomoki HAYASHI  Masafumi NISHIDA  Norihide KITAOKA  Tomoki TODA  Kazuya TAKEDA  

     
    PAPER-Engineering Acoustics

      Vol:
    E101-A No:1
      Page(s):
    199-210

    In this study, toward the development of smartphone-based monitoring system for life logging, we collect over 1,400 hours of data by recording including both the outdoor and indoor daily activities of 19 subjects, under practical conditions with a smartphone and a small camera. We then construct a huge human activity database which consists of an environmental sound signal, triaxial acceleration signals and manually annotated activity tags. Using our constructed database, we evaluate the activity recognition performance of deep neural networks (DNNs), which have achieved great performance in various fields, and apply DNN-based adaptation techniques to improve the performance with only a small amount of subject-specific training data. We experimentally demonstrate that; 1) the use of multi-modal signal, including environmental sound and triaxial acceleration signals with a DNN is effective for the improvement of activity recognition performance, 2) the DNN can discriminate specified activities from a mixture of ambiguous activities, and 3) DNN-based adaptation methods are effective even if only a small amount of subject-specific training data is available.

  • Feature Adaptive Correlation Tracking

    Yulong XU  Yang LI  Jiabao WANG  Zhuang MIAO  Hang LI  Yafei ZHANG  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2016/11/28
      Vol:
    E100-D No:3
      Page(s):
    594-597

    Feature extractor plays an important role in visual tracking, but most state-of-the-art methods employ the same feature representation in all scenes. Taking into account the diverseness, a tracker should choose different features according to the videos. In this work, we propose a novel feature adaptive correlation tracker, which decomposes the tracking task into translation and scale estimation. According to the luminance of the target, our approach automatically selects either hierarchical convolutional features or histogram of oriented gradient features in translation for varied scenarios. Furthermore, we employ a discriminative correlation filter to handle scale variations. Extensive experiments are performed on a large-scale benchmark challenging dataset. And the results show that the proposed algorithm outperforms state-of-the-art trackers in accuracy and robustness.

  • An Improved Supervised Speech Separation Method Based on Perceptual Weighted Deep Recurrent Neural Networks

    Wei HAN  Xiongwei ZHANG  Meng SUN  Li LI  Wenhua SHI  

     
    LETTER-Speech and Hearing

      Vol:
    E100-A No:2
      Page(s):
    718-721

    In this letter, we propose a novel speech separation method based on perceptual weighted deep recurrent neural network (DRNN) which incorporate the masking properties of the human auditory system. In supervised training stage, we firstly utilize the clean label speech of two different speakers to calculate two perceptual weighting matrices. Then, the obtained different perceptual weighting matrices are utilized to adjust the mean squared error between the network outputs and the reference features of both the two clean speech so that the two different speech can mask each other. Experimental results on TSP speech corpus demonstrate that the proposed speech separation approach can achieve significant improvements over the state-of-the-art methods when tested with different mixing cases.

  • Multi-Channel Convolutional Neural Networks for Image Super-Resolution

    Shinya OHTANI  Yu KATO  Nobutaka KUROKI  Tetsuya HIROSE  Masahiro NUMA  

     
    PAPER-IMAGE PROCESSING

      Vol:
    E100-A No:2
      Page(s):
    572-580

    This paper proposes image super-resolution techniques with multi-channel convolutional neural networks. In the proposed method, output pixels are classified into K×K groups depending on their coordinates. Those groups are generated from separate channels of a convolutional neural network (CNN). Finally, they are synthesized into a K×K magnified image. This architecture can enlarge images directly without bicubic interpolation. Experimental results of 2×2, 3×3, and 4×4 magnifications have shown that the average PSNR for the proposed method is about 0.2dB higher than that for the conventional SRCNN.

  • Joint Optimization of Perceptual Gain Function and Deep Neural Networks for Single-Channel Speech Enhancement

    Wei HAN  Xiongwei ZHANG  Gang MIN  Xingyu ZHOU  Meng SUN  

     
    LETTER-Noise and Vibration

      Vol:
    E100-A No:2
      Page(s):
    714-717

    In this letter, we explore joint optimization of perceptual gain function and deep neural networks (DNNs) for a single-channel speech enhancement task. A DNN architecture is proposed which incorporates the masking properties of the human auditory system to make the residual noise inaudible. This new DNN architecture directly trains a perceptual gain function which is used to estimate the magnitude spectrum of clean speech from noisy speech features. Experimental results demonstrate that the proposed speech enhancement approach can achieve significant improvements over the baselines when tested with TIMIT sentences corrupted by various types of noise, no matter whether the noise conditions are included in the training set or not.

  • Using a Single Dendritic Neuron to Forecast Tourist Arrivals to Japan

    Wei CHEN  Jian SUN  Shangce GAO  Jiu-Jun CHENG  Jiahai WANG  Yuki TODO  

     
    PAPER-Biocybernetics, Neurocomputing

      Pubricized:
    2016/10/18
      Vol:
    E100-D No:1
      Page(s):
    190-202

    With the fast growth of the international tourism industry, it has been a challenge to forecast the tourism demand in the international tourism market. Traditional forecasting methods usually suffer from the prediction accuracy problem due to the high volatility, irregular movements and non-stationarity of the tourist time series. In this study, a novel single dendritic neuron model (SDNM) is proposed to perform the tourism demand forecasting. First, we use a phase space reconstruction to analyze the characteristics of the tourism and reconstruct the time series into proper phase space points. Then, the maximum Lyapunov exponent is employed to identify the chaotic properties of time series which is used to determine the limit of prediction. Finally, we use SDNM to make a short-term prediction. Experimental results of the forecasting of the monthly foreign tourist arrivals to Japan indicate that the proposed SDNM is more efficient and accurate than other neural networks including the multi-layered perceptron, the neuro-fuzzy inference system, the Elman network, and the single multiplicative neuron model.

  • Global Hyperbolic Hopfield Neural Networks

    Masaki KOBAYASHI  

     
    PAPER-Nonlinear Problems

      Vol:
    E99-A No:12
      Page(s):
    2511-2516

    In recent years, applications of neural networks with Clifford algebra have become widespread. Hyperbolic numbers are useful Clifford algebra to deal with hyperbolic geometry. It is difficult when Hopfield neural network is extended to hyperbolic versions, though several models have been proposed. Multistate or continuous hyperbolic Hopfield neural networks are promising models. However, the connection weights and domain of activation function are limited to the right quadrant of hyperbolic plane, and the learning algorithms are restricted. In this work, the connection weights and activation function are extended to the entire hyperbolic plane. In addition, the energy is defined and it is proven that the energy does not increase.

  • Speeding up Deep Neural Networks in Speech Recognition with Piecewise Quantized Sigmoidal Activation Function

    Anhao XING  Qingwei ZHAO  Yonghong YAN  

     
    LETTER-Acoustic modeling

      Pubricized:
    2016/07/19
      Vol:
    E99-D No:10
      Page(s):
    2558-2561

    This paper proposes a new quantization framework on activation function of deep neural networks (DNN). We implement fixed-point DNN by quantizing the activations into powers-of-two integers. The costly multiplication operations in using DNN can be replaced with low-cost bit-shifts to massively save computations. Thus, applying DNN-based speech recognition on embedded systems becomes much easier. Experiments show that the proposed method leads to no performance degradation.

  • Steady-versus-Transient Plot for Analysis of Digital Maps

    Hiroki YAMAOKA  Toshimichi SAITO  

     
    PAPER-Nonlinear Problems

      Vol:
    E99-A No:10
      Page(s):
    1806-1812

    A digital map is a simple dynamical system that is related to various digital dynamical systems including cellular automata, dynamic binary neural networks, and digital spiking neurons. Depending on parameters and initial condition, the map can exhibit various periodic orbits and transient phenomena to them. In order to analyze the dynamics, we present two simple feature quantities. The first and second quantities characterize the plentifulness of the periodic phenomena and the deviation of the transient phenomena, respectively. Using the two feature quantities, we construct the steady-versus-transient plot that is useful in the visualization and consideration of various digital dynamical systems. As a first step, we demonstrate analysis results for an example of the digital maps based on analog bifurcating neuron models.

  • Speaker Adaptive Training Localizing Speaker Modules in DNN for Hybrid DNN-HMM Speech Recognizers

    Tsubasa OCHIAI  Shigeki MATSUDA  Hideyuki WATANABE  Xugang LU  Chiori HORI  Hisashi KAWAI  Shigeru KATAGIRI  

     
    PAPER-Acoustic modeling

      Pubricized:
    2016/07/19
      Vol:
    E99-D No:10
      Page(s):
    2431-2443

    Among various training concepts for speaker adaptation, Speaker Adaptive Training (SAT) has been successfully applied to a standard Hidden Markov Model (HMM) speech recognizer, whose state is associated with Gaussian Mixture Models (GMMs). On the other hand, focusing on the high discriminative power of Deep Neural Networks (DNNs), a new type of speech recognizer structure, which combines DNNs and HMMs, has been vigorously investigated in the speaker adaptation research field. Along these two lines, it is natural to conceive of further improvement to a DNN-HMM recognizer by employing the training concept of SAT. In this paper, we propose a novel speaker adaptation scheme that applies SAT to a DNN-HMM recognizer. Our SAT scheme allocates a Speaker Dependent (SD) module to one of the intermediate layers of DNN, treats its remaining layers as a Speaker Independent (SI) module, and jointly trains the SD and SI modules while switching the SD module in a speaker-by-speaker manner. We implement the scheme using a DNN-HMM recognizer, whose DNN has seven layers, and elaborate its utility over TED Talks corpus data. Our experimental results show that in the supervised adaptation scenario, our Speaker-Adapted (SA) SAT-based recognizer reduces the word error rate of the baseline SI recognizer and the lowest word error rate of the SA SI recognizer by 8.4% and 0.7%, respectively, and by 6.4% and 0.6% in the unsupervised adaptation scenario. The error reductions gained by our SA-SAT-based recognizers proved to be significant by statistical testing. The results also show that our SAT-based adaptation outperforms, regardless of the SD module layer selection, its counterpart SI-based adaptation, and that the inner layers of DNN seem more suitable for SD module allocation than the outer layers.

  • Design of Multilevel Hybrid Classifier with Variant Feature Sets for Intrusion Detection System

    Aslhan AKYOL  Mehmet HACIBEYOĞLU  Bekir KARLIK  

     
    PAPER-Information Network

      Pubricized:
    2016/04/05
      Vol:
    E99-D No:7
      Page(s):
    1810-1821

    With the increase of network components connected to the Internet, the need to ensure secure connectivity is becoming increasingly vital. Intrusion Detection Systems (IDSs) are one of the common security components that identify security violations. This paper proposes a novel multilevel hybrid classifier that uses different feature sets on each classifier. It presents the Discernibility Function based Feature Selection method and two classifiers involving multilayer perceptron (MLP) and decision tree (C4.5). Experiments are conducted on the KDD'99 Cup and ISCX datasets, and the proposal demonstrates better performance than individual classifiers and other proposed hybrid classifiers. The proposed method provides significant improvement in the detection rates of attack classes and Cost Per Example (CPE) which was the primary evaluation method in the KDD'99 Cup competition.

  • Food Image Recognition Using Covariance of Convolutional Layer Feature Maps

    Atsushi TATSUMA  Masaki AONO  

     
    LETTER-Image Recognition, Computer Vision

      Pubricized:
    2016/02/23
      Vol:
    E99-D No:6
      Page(s):
    1711-1715

    Recent studies have obtained superior performance in image recognition tasks by using, as an image representation, the fully connected layer activations of Convolutional Neural Networks (CNN) trained with various kinds of images. However, the CNN representation is not very suitable for fine-grained image recognition tasks involving food image recognition. For improving performance of the CNN representation in food image recognition, we propose a novel image representation that is comprised of the covariances of convolutional layer feature maps. In the experiment on the ETHZ Food-101 dataset, our method achieved 58.65% averaged accuracy, which outperforms the previous methods such as the Bag-of-Visual-Words Histogram, the Improved Fisher Vector, and CNN-SVM.

  • Development of an Estimation Model for Instantaneous Presence in Audio-Visual Content

    Kenji OZAWA  Shota TSUKAHARA  Yuichiro KINOSHITA  Masanori MORISE  

     
    PAPER

      Pubricized:
    2015/10/21
      Vol:
    E99-D No:1
      Page(s):
    120-127

    The sense of presence is often used to evaluate the performances of audio-visual (AV) content and systems. However, a presence meter has yet to be realized. We consider that the sense of presence can be divided into two aspects: system presence and content presence. In this study we focused on content presence. To estimate the overall presence of a content item, we have developed estimation models for the sense of presence in audio-only and audio-visual content. In this study, the audio-visual model is expanded to estimate the instantaneous presence in an AV content item. Initially, we conducted an evaluation experiment of the presence with 40 content items to investigate the relationship between the features of the AV content and the instantaneous presence. Based on the experimental data, a neural-network-based model was developed by expanding the previous model. To express the variation in instantaneous presence, 6 audio-related features and 14 visual-related features, which are extracted from the content items in 500-ms intervals, are used as inputs for the model. The audio-related features are loudness, sharpness, roughness, dynamic range and standard deviation in sound pressure levels, and movement of sound images. The visual-related features involve hue, lightness, saturation, and movement of visual images. After constructing the model, a generalization test confirmed that the model is sufficiently accurate to estimate the instantaneous presence. Hence, the model should contribute to the development of a presence meter.

  • Supervised Denoising Pre-Training for Robust ASR with DNN-HMM

    Shin Jae KANG  Kang Hyun LEE  Nam Soo KIM  

     
    LETTER-Speech and Hearing

      Pubricized:
    2015/09/07
      Vol:
    E98-D No:12
      Page(s):
    2345-2348

    In this letter, we propose a novel supervised pre-training technique for deep neural network (DNN)-hidden Markov model systems to achieve robust speech recognition in adverse environments. In the proposed approach, our aim is to initialize the DNN parameters such that they yield abstract features robust to acoustic environment variations. In order to achieve this, we first derive the abstract features from an early fine-tuned DNN model which is trained based on a clean speech database. By using the derived abstract features as the target values, the standard error back-propagation algorithm with the stochastic gradient descent method is performed to estimate the initial parameters of the DNN. The performance of the proposed algorithm was evaluated on Aurora-4 DB, and better results were observed compared to a number of conventional pre-training methods.

  • Uniqueness Theorem of Complex-Valued Neural Networks with Polar-Represented Activation Function

    Masaki KOBAYASHI  

     
    PAPER-Nonlinear Problems

      Vol:
    E98-A No:9
      Page(s):
    1937-1943

    Several models of feed-forward complex-valued neural networks have been proposed, and those with split and polar-represented activation functions have been mainly studied. Neural networks with split activation functions are relatively easy to analyze, but complex-valued neural networks with polar-represented functions have many applications but are difficult to analyze. In previous research, Nitta proved the uniqueness theorem of complex-valued neural networks with split activation functions. Subsequently, he studied their critical points, which caused plateaus and local minima in their learning processes. Thus, the uniqueness theorem is closely related to the learning process. In the present work, we first define three types of reducibility for feed-forward complex-valued neural networks with polar-represented activation functions and prove that we can easily transform reducible complex-valued neural networks into irreducible ones. We then prove the uniqueness theorem of complex-valued neural networks with polar-represented activation functions.

  • A Cascade System of Dynamic Binary Neural Networks and Learning of Periodic Orbit

    Jungo MORIYASU  Toshimichi SAITO  

     
    PAPER

      Pubricized:
    2015/06/22
      Vol:
    E98-D No:9
      Page(s):
    1622-1629

    This paper studies a cascade system of dynamic binary neural networks. The system is characterized by signum activation function, ternary connection parameters, and integer threshold parameters. As a fundamental learning problem, we consider storage and stabilization of one desired binary periodic orbit that corresponds to control signals of switching circuits. For the storage, we present a simple method based on the correlation learning. For the stabilization, we present a sparsification method based on the mutation operation in the genetic algorithm. Using the Gray-code-based return map, the storage and stability can be investigated. Performing numerical experiments, effectiveness of the learning method is confirmed.

  • A Breast Cancer Classifier Using a Neuron Model with Dendritic Nonlinearity

    Zijun SHA  Lin HU  Yuki TODO  Junkai JI  Shangce GAO  Zheng TANG  

     
    PAPER-Biocybernetics, Neurocomputing

      Pubricized:
    2015/04/16
      Vol:
    E98-D No:7
      Page(s):
    1365-1376

    Breast cancer is a serious disease across the world, and it is one of the largest causes of cancer death for women. The traditional diagnosis is not only time consuming but also easily affected. Hence, artificial intelligence (AI), especially neural networks, has been widely used to assist to detect cancer. However, in recent years, the computational ability of a neuron has attracted more and more attention. The main computational capacity of a neuron is located in the dendrites. In this paper, a novel neuron model with dendritic nonlinearity (NMDN) is proposed to classify breast cancer in the Wisconsin Breast Cancer Database (WBCD). In NMDN, the dendrites possess nonlinearity when realizing the excitatory synapses, inhibitory synapses, constant-1 synapses and constant-0 synapses instead of being simply weighted. Furthermore, the nonlinear interaction among the synapses on a dendrite is defined as a product of the synaptic inputs. The soma adds all of the products of the branches to produce an output. A back-propagation-based learning algorithm is introduced to train the NMDN. The performance of the NMDN is compared with classic back propagation neural networks (BPNNs). Simulation results indicate that NMDN possesses superior capability in terms of the accuracy, convergence rate, stability and area under the ROC curve (AUC). Moreover, regarding ROC, for continuum values, the existing 0-connections branches after evolving can be eliminated from the dendrite morphology to release computational load, but with no influence on the performance of classification. The results disclose that the computational ability of the neuron has been undervalued, and the proposed NMDN can be an interesting choice for medical researchers in further research.

81-100hit(287hit)